Open science and reusability in political science

Martin Søyland


University of Oslo

Open science

  • Open access to research (UNESCO) …
    • … to all levels of society
    • … at all stages of research
    • See also Kitchin (2014)
  • Reproducibility, replicability (Dreber and Johannesson 2024), and reusability

Reproducibility



flowchart LR
    
    REPROD[Reproducibility]
    
    REPROD --- COMP{{Computational}}
    REPROD --- REC{{Recreate}}
    REPROD --- ROB{{Robust}}
    
    COMP --> COMPDESC("Original code 
                       and data")
                       
    REC ---> REC1("Access to original 
                  data, but not code")
    REC --> REC2("Access to original 
                  code, but not data")
    REC ---> REC3("Access to neither
                  code nor data")

    ROB --> ROBDESC("Alternative
                     analytical tools")
                     
classDef level1 font-size:32pt, font-weight: bold
classDef level2 font-size:28pt, font-weight: bold
classDef level3 font-size:20pt

class REPROD level1

class COMP level2
class REC level2
class ROB level2

class COMPDESC level3
class REC1 level3
class REC2 level3
class REC3 level3
class ROBDESC level3
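The branches of the flowchart can also be read as a small decision rule. The sketch below is only an illustration in Python: the function name `classify_reproduction` and the `alternative_tools` flag are hypothetical, and it assumes that the robustness branch is decided solely by whether alternative analytical tools are applied.

```python
from enum import Enum


class Reproduction(Enum):
    COMPUTATIONAL = "computational"  # original code and data
    RECREATE = "recreate"            # code, data, or both are missing
    ROBUST = "robust"                # alternative analytical tools


def classify_reproduction(has_code: bool, has_data: bool,
                          alternative_tools: bool = False) -> Reproduction:
    """Map the available materials onto the three branches (hypothetical helper)."""
    if alternative_tools:
        # Assumption: applying alternative tools always counts as a robustness check.
        return Reproduction.ROBUST
    if has_code and has_data:
        return Reproduction.COMPUTATIONAL
    # Data but no code, code but no data, or neither: results must be recreated.
    return Reproduction.RECREATE


full_materials = classify_reproduction(has_code=True, has_data=True)
data_only = classify_reproduction(has_code=False, has_data=True)
robustness = classify_reproduction(True, True, alternative_tools=True)
```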

Replicability


flowchart LR

  REPLI[Replication]
  REPLI --- DIR{{Direct}}
  REPLI --- CONC{{Conceptual}}
    
                       
  DIR --> DIR1("Same design/analysis, 
                  but new data and
                  same population")
  DIR ---> DIR2("Same design/analysis, 
                  but new data and
                  similar population")
  DIR --> DIR3("Same design/analysis, 
                  but new data and
                  different population")
    
  CONC ---> CONC1("Different design/analysis, 
                    new data and
                    same population")
  CONC --> CONC2("Different design/analysis, 
                    new data and
                    similar population")
  CONC ---> CONC3("Different design/analysis,
                    new data and
                    different population")

classDef level1 font-size:32pt, font-weight: bold
classDef level2 font-size:28pt, font-weight: bold
classDef level3 font-size:18pt

class REPLI level1

class DIR level2
class CONC level2

class CONC1 level3
class CONC2 level3
class CONC3 level3

class DIR1 level3
class DIR2 level3
class DIR3 level3

    

Reusability

Reusability is the degree to which research can be used entirely outside the scope of the original research’s purpose (Thanos 2017).

  • By design
    • Data generated with the main purpose of being reused
    • ANES, V-Dem, PAIRDEM, data papers, etc.
  • Emergent
    • Data reused as a byproduct of conscientious data gathering
    • Replication material, shared data, etc.

Why reusability?

  1. Resource efficiency
    • Replication material often does not live up to expectations
    • Gather from scratch or give up?
  2. Data validity
    • More reuse, more validation of data
  3. Provide snapshots of history
    • Data change over time
    • Interpretations change over time

5 principles of reusability



  1. 🤔 Data inclusion, despite irrelevance or missing parts
  2. 📝 Documentation of data gathering
  3. 🌳 Disaggregation of the units of analysis
  4. 🔣 Retain raw data copies
  5. 🤝 Consider data merging potential
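As a rough illustration of how the five principles might look in practice, the Python sketch below works on toy speech-level records. Everything in it is hypothetical: the records, the field names (`speech_id`, `mp_id`, `party`, `words`), the source label, and the helper functions are assumptions made for the example and do not describe any particular dataset or package.

```python
import copy
import hashlib
import json
from collections import Counter

# Hypothetical toy records: one row per speech, the most disaggregated unit.
# The third row lacks a party code but is kept anyway (principle 1).
RAW = [
    {"speech_id": "s1", "mp_id": "mp1", "party": "A", "words": 120},
    {"speech_id": "s2", "mp_id": "mp2", "party": "B", "words": 80},
    {"speech_id": "s3", "mp_id": "mp1", "party": None, "words": 45},
]


def archive_raw(records, source):
    """Keep an untouched copy of the raw data plus gathering notes (principles 2 and 4)."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return {
        "records": copy.deepcopy(records),
        "documentation": {
            "source": source,
            "n_records": len(records),
            "sha256": hashlib.sha256(payload).hexdigest(),
        },
    }


def aggregate(records, unit):
    """Sum words per unit at analysis time rather than at storage time (principle 3)."""
    totals = Counter()
    for row in records:
        totals[row[unit]] += row["words"]
    return dict(totals)


def merge(records, lookup, key):
    """Attach external attributes through a shared identifier (principle 5)."""
    return [{**row, **lookup.get(row[key], {})} for row in records]


archive = archive_raw(RAW, source="hypothetical parliament API")
by_mp = aggregate(RAW, "mp_id")
by_party = aggregate(RAW, "party")
districts = {"mp1": {"district": "North"}, "mp2": {"district": "South"}}
merged = merge(RAW, districts, "mp_id")
```

Because the rows are stored at the speech level with stable identifiers, the same archive can be aggregated by MP or by party, or merged with other sources, without going back to the original source.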

Case: Reusability with dynamic data sources

  • Dynamic data are data that can be reproduced, appended, mended, and extended
  • Scraping or crawling
  • Front-end or back-end
  • Data packages
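One way to picture "appended, mended, and extended" is a local copy that is re-synchronised against its source. The Python sketch below is a minimal illustration under stated assumptions: `sync`, `fetch_v1`, and `fetch_v2` are hypothetical names, the two fetch functions stand in for a scraper or API call and return made-up records, and a record that gains new fields is simply counted as mended.

```python
from typing import Callable, Dict, Iterable

Record = Dict[str, str]


def sync(store: Dict[str, Record],
         fetch: Callable[[], Iterable[Record]],
         key: str = "id") -> Dict[str, int]:
    """Update a local store from a source and report what changed."""
    stats = {"appended": 0, "mended": 0, "unchanged": 0}
    for record in fetch():
        record_id = record[key]
        if record_id not in store:
            stats["appended"] += 1   # new unit in the source
        elif store[record_id] != record:
            stats["mended"] += 1     # corrected or extended unit
        else:
            stats["unchanged"] += 1
            continue
        store[record_id] = dict(record)
    return stats


def fetch_v1():
    # Hypothetical first snapshot of the source, with a typo in record 2.
    return [{"id": "1", "title": "Bill A"}, {"id": "2", "title": "Bil B"}]


def fetch_v2():
    # Hypothetical later snapshot: the typo is fixed and a record is added.
    return [
        {"id": "1", "title": "Bill A"},
        {"id": "2", "title": "Bill B"},
        {"id": "3", "title": "Bill C"},
    ]


store: Dict[str, Record] = {}
first = sync(store, fetch_v1)
second = sync(store, fetch_v2)
```

Running the same `sync` against a later snapshot reproduces the earlier data and picks up corrections and additions, which is the property that makes a dynamic source reusable over time.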

Examples of data packages for R

Dataset      Reference                            Scope                           Downloads  Citations
congress     Kenny (2024)                         Library of Congress API         ~14k
essurvey     Cimentada (2019)                     European Social Survey rounds   ~41k       8
eurlex       Ovadek (2021)                        European Union laws             ~22k       17
hansard      Odell (2017)                         UK Parliament API               ~46k       7
manifestoR   Lewandowski, Merz, and Regel (2020)  Manifesto Project data          ~60k       13
wbstats      Piburn (2020)                        World Bank API                  ~266k      22
WDI          Arel-Bundock (2022)                  World Development Indicators    ~568k      49

Applied examples: stortingscrape

Two examples – Private Member Bills (PMBs)

H1: Opposition MPs use PMBs (representative proposals) strategically to set the agenda for upcoming elections

H2: MPs use PMBs more for inter-party cooperation under minority governments than majority governments

Topics and Elections

Co-operation and Government Types

Panels: 2005-2009, 2009-2013, 2013-2017

Conclusion

  • Open science
    • Reproducibility and replicability
    • Reusability
  • 5 principles of reusability
    1. data inclusion
    2. documentation
    3. disaggregation
    4. retain raw data
    5. consider merging potential
  • stortingscrape
    • dynamic data and reusability
    • PMBs – topics and elections
    • PMBs – co-operation and government types

References

Arel-Bundock, Vincent. 2022. WDI: World Development Indicators and Other World Bank Data. https://CRAN.R-project.org/package=WDI.
Cimentada, Jorge. 2019. Download Data from the European Social Survey on the Fly. https://docs.ropensci.org/essurvey/.
Dreber, Anna, and Magnus Johannesson. 2024. “A Framework for Evaluating Reproducibility and Replicability in Economics.” Economic Inquiry, July. https://doi.org/10.1111/ecin.13244.
Kenny, Christopher T. 2024. Congress: Access the Congress.gov API. https://CRAN.R-project.org/package=congress.
Kitchin, Rob. 2014. The Data Revolution: Big Data, Open Data, Data Infrastructures & Their Consequences. Sage.
Lewandowski, Jirka, Nicolas Merz, and Sven Regel. 2020. manifestoR: Access and Process Data and Documents of the Manifesto Project. https://CRAN.R-project.org/package=manifestoR.
Odell, Evan. 2017. hansard: Provides Easy Downloading Capabilities for the UK Parliament API. https://doi.org/10.5281/zenodo.591264.
Ovadek, Michal. 2021. “Facilitating Access to Data on European Union Laws.” Political Research Exchange 3 (1): 1870150. https://doi.org/10.1080/2474736X.2020.1870150.
Piburn, Jesse. 2020. wbstats: Programmatic Access to the World Bank API. Oak Ridge, Tennessee: Oak Ridge National Laboratory. https://doi.org/10.11578/dc.20171025.1827.
Thanos, Costantino. 2017. “Research Data Reusability: Conceptual Foundations, Barriers and Enabling Technologies.” Publications 5 (1): 2–19. https://doi.org/10.3390/publications5010002.